

Search for: All records where Creators/Authors contains: "Xu, Pengfei"


  1. Abstract

    Catalyst impregnation is the first and one of the most crucial steps in preparing industrial catalysts. The process is typically performed in rotating vessels with a spray nozzle that distributes liquid onto porous catalyst supports until the pore volume is reached. The inter-particle variability of the impregnated liquid significantly affects the activity and selectivity of the resulting catalyst. Current scale-up practices lead to poor fluid distribution and inhomogeneity in the liquid content. The aim of this work is to understand the dynamic behavior of the particles under the spray nozzle, which is essential for achieving the desired content uniformity, and to develop a scale-up model for the dry impregnation process. We considered four dimensionless numbers in the scaling analysis; the scale-up rules require that these dimensionless numbers be kept constant across scales. Both DEM simulations and matching experiments of dry impregnation inside the porous particles were performed for different vessel sizes. The water content of the particles was compared across times and locations, and the relative standard deviation was calculated from the axial water content. Simulation and experimental results show that particles achieve similar content uniformity at the end of impregnation, confirming that the scale-up rules are applicable to all vessel sizes. The dimensionless numbers give very good scale-up performance: the curves collapse, indicating similarity across the processes. In addition, the scale-up method is validated for different particle sizes in simulations.
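    The uniformity metric described above (relative standard deviation of the axial water content) can be computed directly from per-particle data. The following is a minimal Python sketch assuming per-particle axial positions and water contents, e.g. as exported from a DEM simulation; the binning scheme and function names are illustrative, not taken from the paper.

    ```python
    import numpy as np

    def axial_rsd(z, water, n_bins=10):
        """Relative standard deviation (RSD) of water content along the vessel axis.

        z     : axial coordinate of each particle
        water : impregnated liquid content of each particle
        A lower RSD indicates better inter-particle content uniformity.
        """
        edges = np.linspace(z.min(), z.max(), n_bins + 1)
        idx = np.digitize(z, edges[1:-1])  # bin index (0 .. n_bins-1) per particle
        means = np.array([water[idx == i].mean()
                          for i in range(n_bins) if np.any(idx == i)])
        return means.std(ddof=1) / means.mean()

    # Matched dimensionless numbers should yield similar RSD curves across scales:
    # rsd_small = axial_rsd(z_small, w_small)
    # rsd_large = axial_rsd(z_large, w_large)
    ```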

The record-breaking performance of deep neural networks (DNNs) comes with heavy parameter budgets, which often necessitates external dynamic random access memory (DRAM) for storage. The prohibitive energy cost of DRAM accesses makes DNN deployment nontrivial on resource-constrained devices, calling for minimizing the movement of weights and data in order to improve energy efficiency. Driven by this critical bottleneck, we present SmartDeal, a hardware-friendly algorithm framework that trades higher-cost memory storage/access for lower-cost computation, in order to aggressively boost storage and energy efficiency for both DNN inference and training. The core technique of SmartDeal is a novel DNN weight matrix decomposition framework with structural constraints on each matrix factor, carefully crafted to unleash the hardware-aware efficiency potential. Specifically, we decompose each weight tensor as the product of a small basis matrix and a large, structurally sparse coefficient matrix whose nonzero elements are quantized to powers of 2. The resulting sparse and readily quantized DNNs enjoy greatly reduced energy consumption in data movement and weight storage, while incurring minimal overhead to recover the original weights thanks to sparse bit-operations and cost-favorable computations. Beyond inference, we take another leap to embrace energy-efficient training, introducing several customized techniques to address the unique roadblocks that arise in training while preserving the SmartDeal structures. We also design a dedicated hardware accelerator that fully exploits the new weight structure to improve real energy efficiency and latency. We conduct experiments on both vision and language tasks, with nine models, four datasets, and three settings (inference-only, adaptation, and fine-tuning). Our extensive results show that 1) applied to inference, SmartDeal achieves up to 2.44x improvement in energy efficiency as evaluated using real hardware implementations, and 2) applied to training, SmartDeal leads to 10.56x and 4.48x reductions in storage and training energy cost, respectively, with usually negligible accuracy loss, compared to state-of-the-art training baselines. Our source code is available at: https://github.com/VITA-Group/SmartDeal.
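    As an illustration of the decomposition idea, here is a toy Python sketch that factors a weight matrix into a small dense basis and a sparse, power-of-2-quantized coefficient matrix. It uses a truncated SVD plus magnitude thresholding as stand-ins; the paper's actual factorization imposes its own structural constraints and is solved differently (see the linked repository).

    ```python
    import numpy as np

    def power_of_two_quantize(x):
        """Round nonzero entries to the nearest signed power of 2 (in log2 space)."""
        sign = np.sign(x)
        mag = np.abs(x)
        exp = np.round(np.log2(np.where(mag > 0, mag, 1.0)))
        return np.where(mag > 0, sign * 2.0 ** exp, 0.0)

    def smartdeal_like_decompose(W, rank, sparsity=0.7):
        """Factor W (m x n) into a small basis B (m x rank) and a sparse,
        power-of-2 coefficient matrix C (rank x n). Illustrative only."""
        U, S, Vt = np.linalg.svd(W, full_matrices=False)
        B = U[:, :rank] * S[:rank]                   # small dense basis
        C = Vt[:rank]                                # coefficient matrix
        thresh = np.quantile(np.abs(C), sparsity)
        C = np.where(np.abs(C) >= thresh, C, 0.0)    # sparsify (simplified)
        C = power_of_two_quantize(C)                 # cheap to store and multiply
        return B, C

    W = np.random.randn(64, 128)
    B, C = smartdeal_like_decompose(W, rank=16)
    err = np.linalg.norm(W - B @ C) / np.linalg.norm(W)  # reconstruction error
    ```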
  3. (Frankle & Carbin, 2019) shows that there exist winning tickets (small but critical subnetworks) for dense, randomly initialized networks, that can be trained alone to achieve comparable accuracies to the latter in a similar number of iterations. However, the identification of these winning tickets still requires the costly train-prune-retrain process, limiting their practical benefits. In this paper, we discover for the first time that the winning tickets can be identified at the very early training stage, which we term as Early-Bird (EB) tickets, via low-cost training schemes (e.g., early stopping and low-precision training) at large learning rates. Our finding of EB tickets is consistent with recently reported observations that the key connectivity patterns of neural networks emerge early. Furthermore, we propose a mask distance metric that can be used to identify EB tickets with low computational overhead, without needing to know the true winning tickets that emerge after the full training. Finally, we leverage the existence of EB tickets and the proposed mask distance to develop efficient training methods, which are achieved by first identifying EB tickets via low-cost schemes, and then continuing to train merely the EB tickets towards the target accuracy. Experiments based on various deep networks and datasets validate: 1) the existence of EB tickets and the effectiveness of mask distance in efficiently identifying them; and 2) that the proposed efficient training via EB tickets can achieve up to 5.8x ~ 10.7x energy savings while maintaining comparable or even better accuracy as compared to the most competitive state-of-the-art training methods, demonstrating a promising and easily adopted method for tackling cost-prohibitive deep network training. 
Convolutional neural networks (CNNs) are increasingly deployed to edge devices, and many efforts have been made towards efficient CNN inference on resource-constrained platforms. This paper explores an orthogonal direction: how to conduct more energy-efficient training of CNNs, so as to enable on-device training. We reduce the energy cost of training by dropping unnecessary computations at three complementary levels: stochastic mini-batch dropping at the data level, selective layer update at the model level, and sign prediction for low-cost, low-precision back-propagation at the algorithm level. Extensive simulations and ablation studies, with real energy measurements from an FPGA board, confirm the superiority of the proposed strategies and demonstrate remarkable energy savings for training. For example, when training ResNet-74 on CIFAR-10, we achieve aggressive energy savings of >90% and >60% while incurring top-1 accuracy losses of only about 2% and 1.2%, respectively. When training ResNet-110 on CIFAR-100, over 84% of the training energy is saved without degrading inference accuracy.
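    Of the three levels, stochastic mini-batch dropping is the simplest to illustrate: each mini-batch is skipped i.i.d. with some probability, so the expected per-epoch computation shrinks proportionally. A minimal, framework-agnostic Python sketch follows; `train_step` and the drop probability are assumptions for illustration, not the paper's exact settings.

    ```python
    import random

    def train_epoch_with_smd(batches, train_step, p_drop=0.5):
        """Stochastic mini-batch dropping: skip each mini-batch i.i.d. with
        probability p_drop, saving a proportional share of the per-epoch
        forward/backward computation (and hence training energy)."""
        for batch in batches:
            if random.random() < p_drop:
                continue            # drop this batch: no forward/backward pass
            train_step(batch)       # caller-supplied forward/backward/update
    ```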
  5. Whereas the gill chambers of jawless vertebrates open directly into the environment, jawed vertebrates evolved skeletal appendages that drive oxygenated water unidirectionally over the gills. A major anatomical difference between the two jawed vertebrate lineages is the presence of a single large gill cover in bony fishes versus separate covers for each gill chamber in cartilaginous fishes. Here, we find that these divergent patterns correlate with the pharyngeal arch expression of Pou3f3 orthologs. We identify a deeply conserved Pou3f3 arch enhancer present in humans through sharks but undetectable in jawless fish. Minor differences between the bony and cartilaginous fish enhancers account for their restricted versus pan-arch expression patterns. In zebrafish, mutation of Pou3f3 or the conserved enhancer disrupts gill cover formation, whereas ectopic pan-arch Pou3f3b expression generates ectopic skeletal elements resembling the multimeric covers of cartilaginous fishes. Emergence of this Pou3f3 arch enhancer >430 Mya and subsequent modifications may thus have contributed to the acquisition and diversification of gill covers and respiratory strategies during gnathostome evolution.

     